Cache efficient implementation for block matrix operations
Abstract
Efficient manipulation of, and operations on, block matrices benefit many applications, including those that solve nonlinear systems iteratively. Such problems require repeatedly assembling and solving sparse linear systems. For very large systems, solving can become very time-consuming unless the corresponding matrices are handled carefully. This paper proposes a memory storage scheme that is convenient for both numeric and structural matrix modification while also allowing efficient arithmetic operations. The scheme was used to implement a simple BLAS-like library. Its advantage is demonstrated through exhaustive tests on the popular University of Florida Sparse Matrix Collection. The library was also used to solve several nonlinear graph optimization problems.
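The abstract does not describe the storage scheme in detail, so the following is only a minimal sketch of the general idea, not the paper's actual layout. It assumes that each dense block is packed row-major into one contiguous pool and located through a small index. The names `BlockMatrix`, `add_block`, `update_block` and `multiply` are hypothetical and chosen for illustration. Under these assumptions, a structural change appends a block without moving existing data, a numeric change overwrites values in place, and the matrix-vector product reads each block as one contiguous run.

```python
from typing import Dict, List, Tuple


class BlockMatrix:
    """Sparse block matrix: dense blocks packed row-major into one flat pool.

    Illustrative sketch only; this is an assumed layout, not the paper's scheme.
    """

    def __init__(self, row_sizes: List[int], col_sizes: List[int]) -> None:
        self.row_sizes = list(row_sizes)
        self.col_sizes = list(col_sizes)
        # Element offset at which each block row / block column starts.
        self.row_start = self._prefix_sums(self.row_sizes)
        self.col_start = self._prefix_sums(self.col_sizes)
        self.pool: List[float] = []                  # all block values, contiguous
        self.index: Dict[Tuple[int, int], int] = {}  # (block row, block col) -> pool offset

    @staticmethod
    def _prefix_sums(sizes: List[int]) -> List[int]:
        starts = [0]
        for size in sizes:
            starts.append(starts[-1] + size)
        return starts

    def _check_shape(self, bi: int, bj: int, values: List[List[float]]) -> None:
        rows, cols = self.row_sizes[bi], self.col_sizes[bj]
        if len(values) != rows or any(len(row) != cols for row in values):
            raise ValueError(f"block ({bi}, {bj}) must be {rows} x {cols}")

    def add_block(self, bi: int, bj: int, values: List[List[float]]) -> None:
        """Structural change: append a new block; existing blocks are not moved."""
        if (bi, bj) in self.index:
            raise KeyError(f"block ({bi}, {bj}) already exists")
        self._check_shape(bi, bj, values)
        self.index[(bi, bj)] = len(self.pool)
        for row in values:
            self.pool.extend(row)

    def update_block(self, bi: int, bj: int, values: List[List[float]]) -> None:
        """Numeric change: overwrite an existing block in place; layout is unchanged."""
        self._check_shape(bi, bj, values)
        offset = self.index[(bi, bj)]
        cols = self.col_sizes[bj]
        for i, row in enumerate(values):
            self.pool[offset + i * cols: offset + (i + 1) * cols] = row

    def multiply(self, x: List[float]) -> List[float]:
        """Return y = A x, visiting each stored block as one contiguous run."""
        if len(x) != self.col_start[-1]:
            raise ValueError("vector length does not match the matrix width")
        y = [0.0] * self.row_start[-1]
        for (bi, bj), offset in self.index.items():
            rows, cols = self.row_sizes[bi], self.col_sizes[bj]
            r0, c0 = self.row_start[bi], self.col_start[bj]
            for i in range(rows):
                base = offset + i * cols
                acc = 0.0
                for j in range(cols):
                    acc += self.pool[base + j] * x[c0 + j]
                y[r0 + i] += acc
        return y


# Usage: a 3 x 3 matrix split into block rows/columns of sizes 2 and 1.
A = BlockMatrix([2, 1], [2, 1])
A.add_block(0, 0, [[1.0, 2.0], [3.0, 4.0]])
A.add_block(1, 1, [[5.0]])
y = A.multiply([1.0, 1.0, 2.0])
```

Keeping the values in one pool and the structure in a separate index is one common way to make structural edits cheap; a production library would more likely use typed arrays and aligned blocks than Python lists.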
Similar resources
High-Performance Algorithms for Computing the Sign Function of Triangular Matrices
Algorithms and implementations for computing the sign function of a triangular matrix are fundamental building blocks in algorithms for computing the sign of arbitrary square real or complex matrices. We present novel recursive and cache efficient algorithms that are based on Higham’s stabilized specialization of Parlett’s substitution algorithm for computing the sign of a triangular matrix. We...
Increasing the Performance of the Jacobi-Davidson Method by Blocking
Block variants of the Jacobi-Davidson method for computing a few eigenpairs of a large sparse matrix are known to improve the robustness of the standard algorithm when it comes to computing multiple or clustered eigenvalues. In practice, however, they are typically avoided because the total number of matrix-vector operations increases. In this paper we present the implementation of a block Jaco...
Cache Oblivious Matrix Operations Using Peano Curves
Algorithms are called cache oblivious if they are designed to benefit from any kind of cache hierarchy, regardless of its size or number of cache levels. In linear algebra computations, block recursive approaches are common; by construction, they lead to inherently local data access patterns and thus to good overall cache performance [3]. In this article, we present block recursive...
A note on the O(n)-storage implementation of the GKO algorithm
We propose a new O(n)-space implementation of the GKO-Cauchy algorithm for the solution of linear systems with a Cauchy-like matrix. Despite its slightly higher computational cost, this new algorithm makes more efficient use of the processor cache memory. Thus, for matrices of size larger than n ≈ 500–1000, it outperforms the existing algorithms. We present an applicative case of Cauchy-like m...
Cache Oblivious Dense and Sparse Matrix Multiplication Based on Peano Curves
Cache oblivious algorithms are designed to benefit from any existing cache hierarchy—regardless of cache size or architecture. In matrix computations, cache oblivious approaches are usually obtained from block-recursive approaches. In this article, we extend an existing cache oblivious approach for matrix operations, which is based on Peano space-filling curves, for multiplication of sparse and...